NSF PAR Search | NSF Public Access Repository

DiSciPLE: Learning Interpretable Programs for Scientific Visual Discovery

Mall, Utkarsh; Phoo, Cheng Perng; Chiquier, Mia; Hariharan, Bharath; Bala, Kavita; Vondrick, Carl (June 2025, CVPR)

Free, publicly-accessible full text available June 10, 2026
Scale-aware Recognition in Satellite Images under Resource Constraints

Revankar, Shreelekha; Phoo, Cheng Perng; Mall, Utkarsh; Hariharan, Bharath; Bala, Kavita (April 2025, ICLR)

Free, publicly-accessible full text available April 24, 2026
AllClear: A Comprehensive Dataset and Benchmark for Cloud Removal in Satellite Imagery

Zhou, Hangyu; Kao, Chia-Hsiang; Phoo, Cheng Perng; Mall, Utkarsh; Hariharan, Bharath; Bala, Kavita (December 2024, NeurIPS)

Full Text Available
AllClear: A Comprehensive Dataset and Benchmark for Cloud Removal in Satellite Imagery

Zhou, Hangyu; Kao, Chia-Hsiang; Phoo, Cheng Perng; Mall, Utkarsh; Hariharan, Bharath; Bala, Kavita (December 2024, NeurIPS 2024)

Clouds in satellite imagery pose a significant challenge for downstream applications. A major challenge in current cloud removal research is the absence of a comprehensive benchmark and a sufficiently large and diverse training dataset. To address this problem, we introduce AllClear, the largest public dataset for cloud removal, featuring 23,742 globally distributed regions of interest (ROIs) with diverse land-use patterns, comprising 4 million images in total. Each ROI includes complete temporal captures from the year 2022, with (1) multi-spectral optical imagery from Sentinel-2 and Landsat 8/9, (2) synthetic aperture radar (SAR) imagery from Sentinel-1, and (3) auxiliary remote sensing products such as cloud masks and land cover maps. We validate the effectiveness of our dataset by benchmarking performance, demonstrating the scaling law (the PSNR rises from 28.47 to 33.87 with 30× more data), and conducting ablation studies on the temporal length and the importance of individual modalities. This dataset aims to provide comprehensive coverage of the Earth’s surface and promote better cloud removal results.
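
For readers unfamiliar with the PSNR figures quoted in this abstract, the following is a minimal sketch of the standard peak signal-to-noise ratio definition, PSNR = 10 · log10(MAX² / MSE), in decibels. It is not taken from the AllClear code base: the function names `psnr` and `mse_ratio_for_psnr_gain`, the flat-list image representation, and the default `max_value=1.0` are all assumptions made for illustration, and the abstract does not say how the authors normalize pixel values.

```python
import math


def psnr(reference, estimate, max_value=1.0):
    """Standard PSNR in dB between two equally sized flat pixel lists.

    `max_value` is the assumed peak pixel value (hypothetical default: 1.0).
    """
    if len(reference) != len(estimate) or not reference:
        raise ValueError("inputs must be non-empty and the same length")
    mse = sum((r - e) ** 2 for r, e in zip(reference, estimate)) / len(reference)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)


def mse_ratio_for_psnr_gain(gain_db):
    """Factor by which MSE shrinks for a given PSNR gain in dB."""
    return 10.0 ** (gain_db / 10.0)


if __name__ == "__main__":
    # PSNR values reported in the abstract.
    gain = 33.87 - 28.47
    print(f"PSNR gain: {gain:.2f} dB")
    print(f"Implied MSE reduction factor: {mse_ratio_for_psnr_gain(gain):.2f}")
```

Under this definition, the reported gain of 5.40 dB corresponds to the mean squared error falling by a factor of roughly 3.5, provided both values were computed with the same peak value.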
Full Text Available
Change-Aware Sampling and Contrastive Learning for Satellite Images

https://doi.org/10.1109/CVPR52729.2023.00509

Mall, Utkarsh; Hariharan, Bharath; Bala, Kavita (June 2023, Conference: 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR))
Reconstructing Translucent Objects using Differentiable Rendering

https://doi.org/10.1145/3528233.3530714

Deng, Xi; Luan, Fujun; Walter, Bruce; Bala, Kavita; Marschner, Steve (July 2022, SIGGRAPH '22: ACM SIGGRAPH 2022 Conference Proceedings)

Inverse rendering is a powerful approach to modeling objects from photographs, and we extend previous techniques to handle translucent materials that exhibit subsurface scattering. Representing translucency using a heterogeneous bidirectional scattering-surface reflectance distribution function (BSSRDF), we extend the framework of path-space differentiable rendering to accommodate both surface and subsurface reflection. This introduces new types of paths requiring new methods for sampling moving discontinuities in material space that arise from visibility and moving geometry. We use this differentiable rendering method in an end-to-end approach that jointly recovers heterogeneous translucent materials (represented by a BSSRDF) and detailed geometry of an object (represented by a mesh) from a sparse set of measured 2D images in a coarse-to-fine framework incorporating Laplacian preconditioning for the geometry. To efficiently optimize our models in the presence of the Monte Carlo noise introduced by the BSSRDF integral, we introduce a dual-buffer method for evaluating the L2 image loss. This efficiently avoids potential bias in gradient estimation due to the correlation of estimates for image pixels and their derivatives and enables correct convergence of the optimizer even when using low sample counts in the renderer. We validate our derivatives by comparing against finite differences and demonstrate the effectiveness of our technique by comparing inverse-rendering performance with previous methods. We show superior reconstruction quality on a set of synthetic and real-world translucent objects as compared to previous methods that model only surface reflection.
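
The bias that this abstract attributes to correlated pixel and derivative estimates can be illustrated with a toy one-pixel model. The sketch below is not the authors' renderer or their dual-buffer implementation; it assumes a hypothetical pixel whose true value equals a parameter `theta`, estimated with multiplicative noise `w` of mean 1, so the noisy pixel is `theta * w` and its derivative estimate is `w`. The gradient of the L2 loss is 2 · (pixel − target) · d_pixel. Reusing the same sample for both factors biases the average, while drawing the two factors from independent buffers does not. All names here are illustrative.

```python
import random


def single_buffer_gradient(theta, target, noise_std, rng):
    """L2-loss gradient where pixel and derivative share one noisy sample."""
    w = rng.gauss(1.0, noise_std)
    pixel = theta * w      # noisy pixel estimate
    d_pixel = w            # derivative estimate, correlated with the pixel
    return 2.0 * (pixel - target) * d_pixel


def dual_buffer_gradient(theta, target, noise_std, rng):
    """L2-loss gradient where pixel and derivative use independent samples."""
    w_a = rng.gauss(1.0, noise_std)
    w_b = rng.gauss(1.0, noise_std)
    pixel = theta * w_a    # first buffer: pixel value
    d_pixel = w_b          # second buffer: derivative
    return 2.0 * (pixel - target) * d_pixel


def average(estimator, n, seed, **kwargs):
    """Average n draws of an estimator using a seeded generator."""
    rng = random.Random(seed)
    return sum(estimator(rng=rng, **kwargs) for _ in range(n)) / n


if __name__ == "__main__":
    # theta equals the target, so the true gradient is exactly zero.
    params = dict(theta=0.5, target=0.5, noise_std=0.5)
    print("single buffer:", average(single_buffer_gradient, 20000, 0, **params))
    print("dual buffer:  ", average(dual_buffer_gradient, 20000, 0, **params))
```

In this toy setting the single-buffer average converges to 2 · theta · noise_std² (0.25 with the parameters above) instead of zero, so an optimizer would keep moving away from the correct solution, whereas the dual-buffer average converges to the true gradient.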
Full Text Available
Discovering Underground Maps from Fashion

https://doi.org/10.1109/WACV51458.2022.00057

Mall, Utkarsh; Bala, Kavita; Berg, Tamara; Grauman, Kristen (January 2022, 2022 IEEE/CVF Winter Conference on Applications of Computer Vision (WACV))

The fashion sense -- meaning the clothing styles people wear -- in a geographical region can reveal information about that region. For example, it can reflect the kind of activities people do there, or the type of crowds that frequently visit the region (e.g., tourist hot spot, student neighborhood, business center). We propose a method to automatically create underground neighborhood maps of cities by analyzing how people dress. Using publicly available images from across a city, our method finds neighborhoods with a similar fashion sense and segments the map without supervision. For 37 cities worldwide, we show promising results in creating good underground maps, as evaluated using experiments with human judges and underground map benchmarks derived from non-image data. Our approach further allows detecting distinct neighborhoods (what is the most unique region of LA?) and answering analogy questions between cities (what is the "Downtown LA" of Bogota?).
Full Text Available
Field-Guide-Inspired Zero-Shot Learning

https://doi.org/10.1109/ICCV48922.2021.00941

Mall, Utkarsh; Hariharan, Bharath; Bala, Kavita (October 2021, 2021 IEEE/CVF International Conference on Computer Vision (ICCV))

Modern recognition systems require large amounts of supervision to achieve accuracy. Adapting to new domains requires significant data from experts, which is onerous and can become too expensive. Zero-shot learning requires an annotated set of attributes for a novel category. Annotating the full set of attributes for a novel category proves to be a tedious and expensive task in deployment. This is especially the case when the recognition domain is an expert domain. We introduce a new field-guide-inspired approach to zero-shot annotation where the learner model interactively asks for the most useful attributes that define a class. We evaluate our method on classification benchmarks with attribute annotations like CUB, SUN, and AWA2 and show that our model achieves the performance of a model with full annotations at the cost of significantly fewer annotations. Since the time of experts is precious, decreasing annotation cost can be very valuable for real-world deployment.
Full Text Available
AutoPhoto: Aesthetic Photo Capture using Reinforcement Learning

https://doi.org/10.1109/IROS51168.2021.9636788

AlZayer, Hadi; Lin, Hubert; Bala, Kavita (September 2021, 2021 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS))

The process of capturing a well-composed photo is difficult, and it takes years of experience to master. We propose a novel pipeline for an autonomous agent to automatically capture an aesthetic photograph by navigating within a local region in a scene. Instead of classical optimization over heuristics such as the rule-of-thirds, we adopt a data-driven aesthetics estimator to assess photo quality. A reinforcement learning framework is used to optimize the model with respect to the learned aesthetics metric. We train our model in simulation with indoor scenes, and we demonstrate that our system can capture aesthetic photos in both simulation and real-world environments on a ground robot. To our knowledge, this is the first system that can automatically explore an environment to capture an aesthetic photo with respect to a learned aesthetic estimator. Source code is at https://github.com/HadiZayer/AutoPhoto
Full Text Available
